
    Solar activity detection and prediction using image processing and machine learning techniques

    The objective of the research in this dissertation is to develop methods for the automatic detection and prediction of solar activities, including prominence eruptions, emerging flux regions and solar flares. Image processing and machine learning techniques are applied in this study. These methods can be used for automatic observation of solar activities and prediction of space weather that may have great influence on the near-Earth environment. The research presented in this dissertation covers the following topics: i) automatic detection of prominence eruptions (PBs), ii) automatic detection of emerging flux regions (EFRs), and iii) automatic prediction of solar flares. For the detection of prominence eruptions, an automated method is developed by combining image processing and pattern recognition techniques. Consecutive Hα solar images are used as the input. Image processing techniques, including image transformation, segmentation and morphological operations, are used to extract the limb objects and measure the associated properties. Pattern recognition techniques, such as the Support Vector Machine (SVM), are applied to classify all the objects and generate a list of the identified PBs as the output. For the detection of emerging flux regions, an automatic detection method is developed using multi-scale circular harmonic filters, a Kalman filter and an SVM. The method takes a sequence of consecutive Michelson Doppler Imager (MDI) magnetograms as the input. The multi-scale circular harmonic filters are applied to detect bipolar regions on the solar disk surface, and these regions are tracked by the Kalman filter until they disappear. Finally, an SVM classifier is applied to distinguish EFRs from the other regions based on statistical properties. Solar flare prediction is modeled as a conditional density estimation (CDE) problem. 
A novel method is proposed to solve the CDE problem using kernel-based nonlinear regression and moment-based density function reconstruction techniques. This method involves two main steps. In the first step, kernel-based nonlinear regression techniques are applied to predict the conditional moments of the target variable, such as flare peak intensity or flare index. In the second step, the conditional density function is reconstructed from the estimated moments. The method is compared with the traditional double-kernel density estimator, and the experimental results show that its performance is comparable. The most important merit of the new method is that it can handle high-dimensional data effectively, whereas the double-kernel density estimator is confined to the bivariate case because of the difficulty of determining optimal bandwidths. The method can be used to predict the conditional density function of either flare peak intensity or flare index, which shows that it is of practical significance in automated flare forecasting.
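
The two-step idea in the abstract above (estimate conditional moments by kernel regression, then rebuild a density from them) can be illustrated with a minimal sketch. The abstract does not name the regressor or the reconstruction technique, so everything specific here is an assumption: a Nadaraya-Watson estimator with a Gaussian kernel stands in for the kernel-based nonlinear regression, only the first two moments are used, and a Gaussian is fitted to them. The function names, the bandwidth and the toy data are hypothetical.

```python
import math

def gaussian_kernel(x, xi, bandwidth):
    """Gaussian kernel weight between two feature vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def conditional_moments(x, X_train, y_train, bandwidth=1.0):
    """Step 1: kernel-weighted estimates of E[Y|x] and Var[Y|x]."""
    weights = [gaussian_kernel(x, xi, bandwidth) for xi in X_train]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query point is too far from all training points")
    m1 = sum(w * y for w, y in zip(weights, y_train)) / total      # E[Y|x]
    m2 = sum(w * y * y for w, y in zip(weights, y_train)) / total  # E[Y^2|x]
    variance = max(m2 - m1 ** 2, 1e-12)  # guard against round-off
    return m1, variance

def reconstruct_density(mean, variance):
    """Step 2: return a Gaussian pdf matching the two estimated moments."""
    norm = 1.0 / math.sqrt(2.0 * math.pi * variance)

    def pdf(y):
        return norm * math.exp(-((y - mean) ** 2) / (2.0 * variance))

    return pdf

# Toy data: two hypothetical features per sample, one target value each.
X_train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y_train = [1.0, 2.0, 2.0, 3.0]

mean, variance = conditional_moments((0.5, 0.5), X_train, y_train)
density = reconstruct_density(mean, variance)
print(mean, variance, density(mean))
```

Because the regression step works on feature vectors of any length, the sketch extends to many predictors without a separate bandwidth search over the target variable, which is the property the abstract contrasts with the double-kernel estimator.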

    Discovery of Novel Glycogen Synthase Kinase-3beta Inhibitors: Molecular Modeling, Virtual Screening, and Biological Evaluation

    Glycogen synthase kinase-3 (GSK-3) is a multifunctional serine/threonine protein kinase that is engaged in a variety of signaling pathways, regulating a wide range of cellular processes. Due to its distinct regulation mechanism and unique substrate specificity in the molecular pathogenesis of human diseases, GSK-3 is one of the most attractive therapeutic targets for the unmet treatment of pathologies, including type-II diabetes, cancers, inflammation, and neurodegenerative disease. Recent advances in drug discovery targeting GSK-3 have involved extensive computational modeling techniques. Both ligand-based and structure-based approaches have been well explored to design ATP-competitive inhibitors. Molecular modeling plus dynamics simulations can provide insight into the protein-substrate and protein-protein interactions at the substrate binding pocket and the C-lobe hydrophobic groove, which will benefit the discovery of non-ATP-competitive inhibitors. To identify structurally novel and diverse compounds that effectively inhibit GSK-3β, we performed virtual screening by implementing a mixed ligand-based/structure-based approach, which included pharmacophore modeling, diversity analysis, and ensemble docking. The sensitivities of different docking protocols to the induced-fit effects at the ATP-competitive binding pocket of GSK-3β have been explored. An enrichment study was employed to verify the robustness of ensemble docking compared to individual docking in terms of retrieving active compounds from a decoy dataset. A total of 24 structurally diverse compounds obtained from the virtual screening experiment underwent biological validation. The bioassay results showed that 15 out of the 24 hit compounds are indeed GSK-3β inhibitors, and among them, one compound exhibiting sub-micromolar inhibitory activity is a reasonable starting point for further optimization. 
To further identify structurally novel GSK-3β inhibitors, we performed virtual screening by implementing another mixed ligand-based/structure-based approach, which included quantitative structure-activity relationship (QSAR) analysis and docking prediction. To integrate and analyze complex data sets from multiple experimental sources, we drafted and validated a hierarchical QSAR approach, which adopts a multi-level structure to take data heterogeneity into account. A collection of 728 GSK-3 inhibitors with diverse structural scaffolds was obtained from published papers of 7 research groups based on different experimental protocols. Support vector machines and random forests were implemented with wrapper-based feature selection algorithms in order to construct predictive learning models. The best models for each single group of compounds were then selected, based on both internal and external validation, and used to build the final hierarchical QSAR model. The predictive performance of the hierarchical QSAR model is demonstrated by an overall R² of 0.752 for the 141 compounds in the test set. The compounds obtained from the virtual screening experiment underwent biological validation. The bioassay results confirmed that 2 hit compounds are indeed GSK-3β inhibitors exhibiting sub-micromolar inhibitory activity, and therefore validated hierarchical QSAR as an effective approach to be used in virtual screening experiments. We have successfully implemented a variant of supervised learning, named multiple-instance learning, in order to predict the bioactive conformers of a given molecule that are responsible for the observed biological activity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. 
The proposed approach was shown not to suffer from overfitting and to be highly competitive with classical predictive models, which makes it a powerful tool for drug activity prediction. The approach was also validated as a useful method for the pursuit of bioactive conformers.
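
The enrichment study mentioned in the abstract above compares ensemble docking with individual docking by how well each retrieves known actives from a decoy set. A minimal sketch of one common way to quantify this is the enrichment factor: the active rate in the top-ranked fraction divided by the active rate in the whole set. The abstract does not state which metric or score-combination rule was used, so the function names, the "lower score is better" convention, the best-score-per-compound ensemble rule and the toy scores are all assumptions.

```python
def enrichment_factor(scores, is_active, top_fraction=0.1):
    """Active rate in the top-ranked fraction divided by the overall active rate.

    Assumes lower docking scores are better; reverse the sort otherwise.
    """
    if not scores or len(scores) != len(is_active):
        raise ValueError("scores and is_active must be non-empty and equal length")
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    n_top = max(1, int(round(top_fraction * len(scores))))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    top_actives = sum(active for _, active in ranked[:n_top])
    return (top_actives / n_top) / (total_actives / len(scores))

def ensemble_scores(*score_lists):
    """Assumed ensemble rule: keep each compound's best (lowest) score."""
    return [min(per_compound) for per_compound in zip(*score_lists)]

# Toy data: 10 compounds, the first two are known actives (1), the rest decoys (0).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
structure_1 = [-9.0, -4.0, -8.0, -7.0, -6.0, -5.0, -3.0, -2.0, -1.0, -0.5]
structure_2 = [-6.0, -9.5, -5.0, -4.0, -3.0, -2.0, -1.0, -0.5, -0.2, -0.1]

ef_single = enrichment_factor(structure_1, labels, top_fraction=0.2)
ef_ensemble = enrichment_factor(
    ensemble_scores(structure_1, structure_2), labels, top_fraction=0.2
)
print(ef_single, ef_ensemble)
```

In this toy case the single receptor structure ranks only one active in the top 20%, while the ensemble ranks both there, so the ensemble's enrichment factor is higher.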

    Implementation of Multiple-Instance Learning in Drug Activity Prediction

    In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. We encoded the 3-dimensional structures using pharmacophore fingerprints, which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. A 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features. The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. Based on the external validations, the proposed approach produced the best predictive models for one data set and the second-best predictive models for the rest of the data sets. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. Ten of the 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach. 
The proposed approach was demonstrated to be highly competitive with classical predictive models, which makes it a powerful tool for drug activity prediction. The approach was also validated as a useful method for the pursuit of bioactive conformers.
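
The instance-based embedding described in the abstract above can be sketched as follows: each molecule (a bag of conformer fingerprints) is mapped to a fixed-length vector whose entries are the smallest dissimilarity between any of its conformers and each prototype conformer. The abstract does not name its four dissimilarity measures, so the Tanimoto dissimilarity used here is an assumption, as are the function names, the set-of-on-bits fingerprint representation and the toy molecules. The 1-norm SVM step that would follow is not shown.

```python
def tanimoto_dissimilarity(fp_a, fp_b):
    """1 - |A & B| / |A | B| for fingerprints stored as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union

def embed_bag(bag, prototypes):
    """Map a bag of conformers to one vector of minimum distances to each prototype."""
    return [
        min(tanimoto_dissimilarity(conformer, prototype) for conformer in bag)
        for prototype in prototypes
    ]

# Toy bags: each molecule is a list of conformer fingerprints (hypothetical bits).
bags = {
    "mol_A": [frozenset({0, 1, 2}), frozenset({2, 3})],
    "mol_B": [frozenset({4, 5})],
}

# Every training conformer serves as a prototype, so each embedded feature
# corresponds to one specific conformer.
prototypes = [conformer for bag in bags.values() for conformer in bag]
embedded = {name: embed_bag(bag, prototypes) for name, bag in bags.items()}
print(embedded)
```

Because each embedded feature is tied to one prototype conformer, a sparse classifier that keeps only a few features also points to a few conformers, which is how joint feature selection and classification can double as a search for candidate bioactive conformers.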